Abstract

Natural  sounds  are  broadband  and  dynamic.  To  understand  their  encoding  in  primary 
auditory  cortex  (AI),  we  have  characterized  the  responses  of  units  in  AI  with  elementary 
versions  of  such  spectra — moving  ripples.  Ripples  are  broadband  sounds  with  a  sinusoidal 
envelope  along  the  log  frequency  axis,  that  move  up  or  down  with  a  constant  velocity. 
Speech  spectra  can  be  decomposed  into  a  superposition  of  ripples  with  different  densities 
and  velocities. 

If  AI  units  are  linear,  then  it  is  possible  to  predict  how  a  unit  responds  to  any  broadband 
dynamic  stimulus  by  first  measuring  its  responses  to  all  elementary  ripples  (i.e.,  measure  the 
ripple  transfer  function),  and  then  superposing  the  responses  to  these  ripples,  each  according 
to  its  weight  in  the  input.  We  have  successfully  demonstrated  the  linearity  of  AI  units  in  the 
past  using  ripples  either  stationary  or  moving  only  downward  in  frequency.  The  data 
described  in  this  poster  will  show  that  transfer  functions  are  also  separable  for  up-moving 
ripples,  but  that  the  two  transfer  functions  may  well  be  different.  Hence  AI  units  are  not 
always  fully  separable,  but  only  separable  by  quadrant.  We  shall  discuss  the  implications  of 
these  results  and  show  examples  of  predicted  and  measured  responses  to  speech. 
Summary

Question:  How  is  timbre  encoded 
in  primary  auditory  cortex? 

Important  Concepts: 

•  Response  Field  (RF):  range  of  frequencies  that  influence  a  neuron.  RF  is  a  function  of  time. 

•  Ripples :  broadband  sounds  with  sinusoidally  modulated  spectral  envelope. 

•  Data  analysis  based  on  linear  systems ;  by  varying  ripple  frequency  and  velocity,  we  measure 
the  transfer  function.  The  inverse  Fourier  transform  gives  the  spectro-temporal  RF  (STRF). 

We  show  predictions  of  single-unit  responses  to  complex  spectra,  including: 

•  Linearity  of  responses  to  dynamic  ripples:  responses  to  upward  and  downward  moving 
ripples  can  be  superimposed  to  predict  responses  to  arbitrary  combinations. 

•  Separability  of  spectral  and  temporal  measurements  of  the  responses:  spectral  properties  can 
be  measured  independently  of  temporal  properties  in  some  cases. 

We  find: 

• Cells can be characterized by an STRF, separable or non-separable.

•  Cells  behave  like  a  linear  system:  when  presented  with  a  sum  of  several  profiles,  the 
response  is  the  sum  of  the  responses  to  the  individual  profiles. 

We  conclude  that  the  combined  spectro-temporal  decomposition  in  AI  is  an  affine  wavelet 
transformation  of  the  input,  in  concert  with  a  similar  temporal  decomposition.  The  auditory 
profile  is  the  result  of  a  multistage  process  which  occurs  early  in  the  pathway.  This  pattern 
is  projected  centrally  where  a  multiscale  representation  is  generated  in  AI  by  STRFs  with  a 
range  of  widths,  asymmetries,  BFs,  time  lags  and  directional  sensitivities. 


Natural  sound 

• loudness (~ intensity)

•  pitch  (~  tonal  height) 

•  timbre  (the  rest,  e.g. 
spectral  envelope) 
Spectro-Temporal Transform


Frequencies  are  mapped  along  the 
cochlea  on  a  log  (frequency)  axis. 
Since  natural  sounds  are  dynamic,  we 
need  a  time  axis. 

Therefore  we  use  two-dimensional 
functions  of  log(frequency)  and  time. 

Spectrogram envelope of a
speech fragment, “water all year”

$x = \log f$


Consider  the  (Fourier)  space  dual  to  the 
two-dimensional  spectro-temporal  space. 
For  linear  systems,  the  spectro-temporal 
domain  and  its  Fourier  domain  are 
equivalent.  Analysis  is  often  conceptually 
simpler  in  the  Fourier  domain. 

Fourier  transform  of  the 
envelope  of  the  spectrogram 


Fourier Transform
$[\cdot]\,\exp(\pm 2\pi j \Omega x \pm 2\pi j w t)$
Inverse Transform

$w$ = “ripple velocity”
$\Omega$ = “ripple frequency”
Real functions in the spectro-temporal domain give rise to
complex  conjugate  symmetric  functions  in  the  Fourier  domain. 
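
The symmetry stated above can be checked numerically. The following is a minimal sketch, assuming a discrete envelope on a small hypothetical grid (the array size and the random values are illustrative only) and using NumPy's 2-D FFT as a stand-in for the continuous transform.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical real-valued envelope: 8 log-frequency bins x 16 time bins.
envelope = rng.standard_normal((8, 16))

F = np.fft.fft2(envelope)

# Build G[k, l] = F[-k, -l] (indices modulo the array size):
# reverse both axes, then roll by one so that index 0 stays in place.
F_negated = np.roll(F[::-1, ::-1], shift=(1, 1), axis=(0, 1))

# For a real input, F[-k, -l] equals the complex conjugate of F[k, l].
is_conjugate_symmetric = np.allclose(F_negated, np.conj(F))
```

Because of this symmetry, only half of the Fourier plane carries independent information, which is why the transfer function needs to be measured only for ripples moving up and down at non-negative ripple frequencies.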
Spectro-Temporal Response

•  The  spectro-temporal  response  field  of  a  neuron  is  the  usual  response  field  made  time- 
dependent.  Equivalently,  it  is  the  temporal  impulse  response  for  each  frequency. 

•  Its  Fourier  Transform  is  the  transfer  function. 

•  Either  can  be  used  to  predict  the  response  to  any  broadband  dynamic  sound. 
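
As a minimal sketch of how an STRF predicts a response under the linearity assumption, the code below sums, over frequency channels, the convolution in time of each stimulus channel with the matching row of the STRF. The function name `predict_response`, the array layout (rows are log-frequency channels, columns are time bins) and the toy values are assumptions made for illustration, not taken from the poster.

```python
import numpy as np

def predict_response(strf, spectrogram):
    """Linear prediction r[t] = sum_x sum_tau strf[x, tau] * spectrogram[x, t - tau].

    strf:        (n_freq, n_lags) array, one temporal impulse response per channel.
    spectrogram: (n_freq, n_time) array, stimulus envelope.
    """
    n_freq, _ = strf.shape
    n_time = spectrogram.shape[1]
    response = np.zeros(n_time)
    for x in range(n_freq):
        # Causal convolution along time, truncated to the stimulus length.
        response += np.convolve(spectrogram[x], strf[x])[:n_time]
    return response

# Toy check: a single impulse in channel 2 at time bin 3 reads out that
# channel's impulse response, delayed by 3 bins.
toy_strf = np.arange(12, dtype=float).reshape(3, 4)
impulse = np.zeros((3, 10))
impulse[2, 3] = 1.0
impulse_response = predict_response(toy_strf, impulse)
```

The same prediction could equivalently be computed in the Fourier domain by multiplying the stimulus transform with the transfer function.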

[Figure panels: Spectro-Temporal Response Function (STRF) of a neuron, with $x = \log f$; 2-Dimensional Transfer Function of the same neuron]

Fourier Transform
$[\cdot]\,\exp(\pm 2\pi j \Omega x \pm 2\pi j w t)$
Inverse Transform
Ripple decomposition of a broadband dynamic sound

(A) The envelope of a speech fragment is Fourier transformed in (B). The Fourier transform is
then approximated by its 100 largest components in (C) and then inverted back in (D), giving
an excellent approximation to the original envelope.
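
The approximation step can be sketched as follows, assuming a discrete envelope and NumPy's 2-D FFT. The function name `keep_largest_components` and the synthetic two-ripple envelope are hypothetical; the speech data from the poster is not reproduced here.

```python
import numpy as np

def keep_largest_components(envelope, n_keep=100):
    """Zero all but the n_keep largest-magnitude 2-D Fourier coefficients, then invert.

    Conjugate pairs have equal magnitude, so ties at the threshold may keep
    slightly more than n_keep coefficients. Taking the real part handles the
    case where rounding splits a conjugate pair.
    """
    F = np.fft.fft2(envelope)
    magnitudes = np.abs(F).ravel()
    if n_keep < magnitudes.size:
        threshold = np.partition(magnitudes, -n_keep)[-n_keep]
        F = np.where(np.abs(F) >= threshold, F, 0)
    return np.fft.ifft2(F).real

# Synthetic envelope: a constant plus two ripples (5 nonzero Fourier coefficients).
n_x, n_t = 32, 64
X, T = np.meshgrid(np.arange(n_x) / n_x, np.arange(n_t) / n_t, indexing="ij")
strong_ripple = 0.5 * np.sin(2 * np.pi * (2 * X + 3 * T))
weak_ripple = 0.3 * np.cos(2 * np.pi * (4 * X - 1 * T))
synthetic_envelope = 1.0 + strong_ripple + weak_ripple

exact = keep_largest_components(synthetic_envelope, n_keep=5)
coarse = keep_largest_components(synthetic_envelope, n_keep=3)
```

With five components the synthetic envelope is recovered exactly; with three, only the constant and the stronger ripple survive.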


Ripple Decomposition

[Figure panels: Spectrogram (log frequency) of “water all year”; Ripple Transform (100 peaks)]
The Ripple Stimulus

Ripples  are  broadband  sounds  with  sinusoidally  modulated  spectral  envelope  along 
the  log  (frequency)  axis,  analogous  to  visual  gratings. 
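
A ripple envelope can be sketched as a sinusoid in the plane of log frequency and time. The parameterization below, 1 + depth * sin(2*pi*(w*t + Omega*x) + phase), is an assumed form chosen for illustration; the modulation depth, the phase argument and the grid are hypothetical. With this sign convention, positive ripple velocity makes the peaks drift toward lower x.

```python
import numpy as np

def ripple_envelope(x, t, ripple_freq, ripple_vel, depth=0.9, phase=0.0):
    """Spectral envelope of a moving ripple on an (x, t) grid.

    x:           log-frequency axis in octaves
    t:           time axis in seconds
    ripple_freq: Omega, in cycles/octave
    ripple_vel:  w, in cycles/second
    depth:       modulation depth (assumed parameter, 0..1 keeps the envelope non-negative)
    """
    X, T = np.meshgrid(x, t, indexing="ij")
    return 1.0 + depth * np.sin(2 * np.pi * (ripple_vel * T + ripple_freq * X) + phase)

x_axis = np.linspace(0.0, 5.0, 101)   # 5 octaves in steps of 0.05 octave
t_axis = np.linspace(0.0, 1.0, 201)   # 1 second in steps of 5 ms
env = ripple_envelope(x_axis, t_axis, ripple_freq=0.4, ripple_vel=4.0)
```

Here the envelope repeats every 2.5 octaves along x and every 250 ms along t, and each peak moves down by 0.25 octave every 25 ms.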


Ripple  in  Spectro-Temporal 
Space  (Spectrogram) 
The Fourier transform of a “ripple”
has support only on a single point
(and its complex conjugate).
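
This can be illustrated on a discrete grid. The sketch below assumes a zero-mean ripple with a whole number of cycles in the analysis window (3 along the spectral axis, 5 along time; both values are hypothetical), so that its energy falls on exact FFT bins. A ripple with a constant offset would add one more point at the origin.

```python
import numpy as np

n_x, n_t = 32, 64
X, T = np.meshgrid(np.arange(n_x) / n_x, np.arange(n_t) / n_t, indexing="ij")

cycles_x, cycles_t = 3, 5   # cycles per window along each axis (illustrative)
ripple = np.sin(2 * np.pi * (cycles_x * X + cycles_t * T))

F = np.fft.fft2(ripple)
# Indices of all bins that are not numerically zero.
support = np.argwhere(np.abs(F) > 1e-6 * np.abs(F).max())
```

`support` contains two bins: (3, 5) and its conjugate partner (29, 59), which is (-3, -5) modulo the grid size.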


Ripple  in  Fourier  Space 
Measurements by Ripple Frequency

[Figure: panels (A) to (D); axis label: Ripple Frequency (cyc/oct)]


Spike events in (A) are turned into period
histograms in (B). The amplitudes and
phases give the transfer function in (C),
which can be inverse Fourier transformed
to give Response Fields in (D).
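
The first two steps can be sketched as follows: spike times are folded onto one stimulus period, and the amplitude and phase of the histogram component at that period give one complex point of the transfer function. The function names, the bin count and the spike times are hypothetical, and reading out the first Fourier component is an assumption about how amplitude and phase are extracted.

```python
import numpy as np

def period_histogram(spike_times, period, n_bins=16):
    """Fold spike times onto one stimulus period and count spikes per phase bin."""
    phases = np.mod(spike_times, period) / period   # fraction of a period
    counts, _ = np.histogram(phases, bins=n_bins, range=(0.0, 1.0))
    return counts

def first_harmonic(counts):
    """Amplitude and phase of the histogram component at the stimulus period.

    For counts[n] = c + A*cos(2*pi*n/N + phi), this returns (A, phi).
    The phase is relative to the left edge of the first bin.
    """
    coeff = np.fft.fft(counts)[1] / len(counts)
    return 2.0 * np.abs(coeff), np.angle(coeff)

# Toy spike train, period 1 s: three spikes near phase 0.1, one near phase 0.6.
toy_counts = period_histogram(np.array([0.1, 1.1, 2.1, 0.6]), period=1.0, n_bins=4)

# Synthetic histogram with known amplitude 4 and phase 0.5 rad.
bins = np.arange(16)
synthetic_counts = 10.0 + 4.0 * np.cos(2 * np.pi * bins / 16 + 0.5)
amplitude, phase = first_harmonic(synthetic_counts)
```

Repeating this for each ripple frequency (and each ripple velocity) fills in the transfer function point by point.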
Spectro-Temporal Response Fields


Examples of experimentally obtained STRFs
Separable


Non-separable


Note  the  variety  of  spectral  and  temporal  behaviors 
Quadrant Separability

An STRF can fall into one of three categories:

• Non-separable: The transfer function is an arbitrary (complex-conjugate symmetric)
function of ripple frequency and ripple velocity.

• Quadrant separable: The transfer function within each quadrant is a product of a function
of ripple frequency and a function of ripple velocity. The envelope of the STRF is a simple
product of a function of spectrum and a function of time.

• Fully separable: The transfer function is everywhere the product of a function of ripple
frequency and a function of ripple velocity. The resulting STRF is a product of a function
of spectrum and a function of time.

[Figure panels: Spectro-Temporal Domain; 2D Fourier Transform; Fourier Domain; label: Non-separable]
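
One way to quantify these categories, offered here as an assumption rather than as the method used on the poster, is the singular value decomposition: a matrix that is an exact product of a function of ripple frequency and a function of ripple velocity has rank one. The sketch below computes the fraction of energy in the first singular value for each quadrant and for both quadrants together. The name `separability_index` and the toy values are hypothetical.

```python
import numpy as np

def separability_index(transfer):
    """Fraction of energy in the first singular value.

    transfer: 2-D array, rows indexed by ripple frequency, columns by ripple velocity.
    Returns 1.0 when transfer is an exact outer product F(Omega) * G(w).
    """
    s = np.linalg.svd(transfer, compute_uv=False)
    return float(s[0] ** 2 / np.sum(s ** 2))

# Toy quadrant-separable transfer function: each direction is an outer
# product, but the spectral factor differs between the two directions.
f_down, g_down = np.array([1.0, 2.0, 3.0]), np.array([1.0, 0.5])
f_up, g_up = np.array([3.0, 1.0, -2.0]), np.array([2.0, 1.0])
quadrant_down = np.outer(f_down, g_down)
quadrant_up = np.outer(f_up, g_up)
both_quadrants = np.hstack([quadrant_down, quadrant_up])

index_down = separability_index(quadrant_down)
index_up = separability_index(quadrant_up)
index_both = separability_index(both_quadrants)
```

In this toy case each quadrant has an index of 1 while the combined matrix does not, which is the signature of quadrant separability without full separability.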


Center  for  Auditory 
and  Acoustic  Research 


Institute for Systems Research
University  of  Maryland 


Linearity  in  Theory 


Assuming  linearity,  the  STRF  predicts  the  response  to  any  broadband  dynamic  stimulus, 
including  single  ripples  moving  in  either  direction  (first  two  rows)  and  combinations  of 
upward  and  downward  moving  ripples. 
Linearity in Practice

The  correlation  between  predicted  and  actual  response  is  quite  good  for 
most  cells.  Since  cells  cannot  fire  at  negative  rates,  any  prediction  should 
be  half-wave  rectified  before  comparing  to  the  actual  response. 
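
A minimal sketch of this comparison, assuming the predicted and measured responses are sampled on the same time bins: negative predicted rates are set to zero, and the result is compared with the measurement using the Pearson correlation coefficient. The function name and the synthetic signals are hypothetical, and the choice of Pearson correlation is an assumption.

```python
import numpy as np

def rectified_correlation(predicted, measured):
    """Half-wave rectify the linear prediction, then correlate with the measured response."""
    rectified = np.maximum(predicted, 0.0)
    return float(np.corrcoef(rectified, measured)[0, 1])

# Synthetic example: the "measured" response is a scaled, rectified sinusoid.
time = np.linspace(0.0, 1.0, 200, endpoint=False)
linear_prediction = np.sin(2 * np.pi * 3 * time)
measured_rate = 5.0 * np.maximum(linear_prediction, 0.0)

with_rectification = rectified_correlation(linear_prediction, measured_rate)
without_rectification = float(np.corrcoef(linear_prediction, measured_rate)[0, 1])
```

In this synthetic case the correlation is higher with rectification than without, since the unrectified prediction is penalized for the negative half-cycles that a firing rate cannot follow.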


[Figure panels: Stimulus Spectrogram; STRF; Response]
Cortical Filter Model


• Response fields in AI have characteristic shapes both spectrally and temporally.

• AI cells respond well only to a small set of moving ripples around a particular spectral
peak spacing and velocity.

• We find cortical cells with all center frequencies, spectral symmetries, bandwidths,
latencies and temporal impulse response symmetries.

• Therefore AI decomposes the input spectrum into different spectrally and temporally tuned
channels.

• Equivalently, a population of cells, tuned around different moving ripple parameters, can
effectively represent the input spectrum at multiple scales.
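
The idea of a channel tuned around one moving ripple can be sketched as a band-pass gain in the Fourier domain. The Gaussian tuning shape, the function name `ripple_channel` and all parameter values below are assumptions for illustration; they are not the filters used on the poster. The gain is placed on the preferred ripple and on its complex-conjugate point so that the output stays real.

```python
import numpy as np

def ripple_channel(envelope, center_x, center_t, width=1.0):
    """Filter a 2-D envelope with a Gaussian gain centered on one ripple.

    center_x, center_t: preferred ripple, in cycles per window along the
                        spectral and temporal axes (sign of center_t sets direction).
    width:              tuning width in the same units (assumed Gaussian shape).
    """
    n_x, n_t = envelope.shape
    kx = np.fft.fftfreq(n_x, d=1.0 / n_x)[:, None]   # integer cycles per window
    kt = np.fft.fftfreq(n_t, d=1.0 / n_t)[None, :]

    def bump(cx, ct):
        return np.exp(-((kx - cx) ** 2 + (kt - ct) ** 2) / (2.0 * width ** 2))

    gain = bump(center_x, center_t) + bump(-center_x, -center_t)
    return np.fft.ifft2(np.fft.fft2(envelope) * gain).real

n_x, n_t = 32, 64
X, T = np.meshgrid(np.arange(n_x) / n_x, np.arange(n_t) / n_t, indexing="ij")
test_ripple = np.sin(2 * np.pi * (3 * X + 5 * T))

matched = ripple_channel(test_ripple, center_x=3, center_t=5)
opposite_direction = ripple_channel(test_ripple, center_x=3, center_t=-5)
```

A channel matched to the ripple passes it essentially unchanged, while a channel tuned to the same spacing moving in the opposite direction suppresses it. Applying a bank of such channels with different centers and widths gives one output per channel, a multiscale view of the same input.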
Theoretical ripple filters used to generate a
‘cortical  representation’ 
The Cortical Representation

Spectrally  narrow  cells  pick  out  the  fine  features  of  the  spectral  profile,  whereas  broadly  tuned 
cells  pick  out  the  coarse  outlines  of  the  spectrum.  Similarly,  dynamically  sluggish  cells  will 
respond  to  the  slow  changes  in  the  spectrum,  whereas  fast  cells  respond  to  rapid  onsets  and 
transitions.  In  this  manner,  AI  is  able  to  encode  multiple  views  of  the  same  dynamic  spectrum. 